
A Transportation \(L^p\) Distance for Signal Analysis

  • Published in: Journal of Mathematical Imaging and Vision

Abstract

Transport-based distances, such as the Wasserstein distance and earth mover’s distance, have been shown to be an effective tool in signal and image analysis. The success of transport-based distances is in part due to their Lagrangian nature, which allows them to capture the important variations in many signal classes. However, these distances require the signal to be non-negative and normalised. Furthermore, the signals are considered as measures and compared by redistributing (transporting) them, which does not directly take into account the signal intensity. Here, we study a transport-based distance, called the \(TL^p\) distance, that combines Lagrangian and intensity modelling and is directly applicable to general, non-positive and multichannelled signals. The distance can be computed by existing numerical methods. We give an overview of the basic properties of this distance and applications to classification, with multichannelled non-positive one-dimensional signals and two-dimensional images, and colour transfer.


References

  1. Åström, F., Petra, S., Schmitzer, B., Schnörr, C.: Image labeling by assignment. arXiv:1603.05285 (2016)

  2. Aswolinskiy, W., Reinhart, R.F., Steil, J.: Artificial Neural Networks in Pattern Recognition: 7th IAPR TC3 Workshop, ANNPR 2016, Ulm, Germany, September 28–30, 2016, Proceedings, chapter Time Series Classification in Reservoir- and Model-Space: A Comparison, pp. 197–208. Springer International Publishing (2016)

  3. Basu, S., Kolouri, S., Rohde, G.K.: Detecting and visualizing cell phenotype differences from microscopy images using transport-based morphometry. Proc. Natl. Acad. Sci. 111(9), 3448–3453 (2014)

  4. Benamou, J.-D., Brenier, Y.: A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem. Numer. Math. 84(3), 375–393 (2000)

  5. Benamou, J.-D., Carlier, G., Cuturi, M., Nenna, L., Peyré, G.: Iterative Bregman projections for regularized transportation problems. SIAM J. Sci. Comput. 37(2), A1111–A1138 (2015)

  6. Birkhoff, G.: Three observations on linear algebra. Univ. Nac. Tucumán Rev. Ser. A 5, 147–151 (1946)

  7. Bonneel, N., Rabin, J., Peyré, G., Pfister, H.: Sliced and Radon Wasserstein barycenters of measures. J. Math. Imaging Vis. 51(1), 22–45 (2015)

  8. Bonneel, N., Van De Panne, M., Paris, S., Heidrich, W.: Displacement interpolation using Lagrangian mass transport. In: ACM Transactions on Graphics (TOG)-Proceedings of ACM SIGGRAPH Asia 2011, vol. 30 (2011)

  9. Brenier, Y., Frisch, U., Hénon, M., Loeper, G., Matarrese, S., Mohayaee, R., Sobolevski, A.: Reconstruction of the early universe as a convex optimization problem. Mon. Not. R. Astron. Soc. 346(2), 501–524 (2003)


  10. Chen, J., Paris, S., Durand, F.: Real-time edge-aware image processing with the bilateral grid. In: ACM SIGGRAPH 2007 Papers, SIGGRAPH ’07. ACM (2007)

  11. Chen, Y., Georgiou, T.T., Tannenbaum, A.: Matrix optimal mass transport: a quantum mechanical approach. arXiv:1610.03041 (2016)

  12. Chen, Y., Georgiou, T.T., Tannenbaum, A.: Interpolation of matrices and matrix-valued measures: the unbalanced case. arXiv:1612.05914 (2017)

  13. Chizat, L., Peyré, G., Schmitzer, B., Vialard, F.-X.: An interpolating distance between optimal transport and Fisher–Rao metrics. Found. Comput. Math. doi:10.1007/s10208-016-9331-y

  14. Chizat, L., Peyré, G., Schmitzer, B., Vialard, F.-X.: Scaling algorithms for unbalanced optimal transport problems. arXiv:1607.05816 (2016)

  15. Courty, N., Flamary, R., Tuia, D.: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2014, Nancy, France, September 15–19, 2014. In: Proceedings, Part I, chapter Domain Adaptation with Regularized Optimal Transport, pp. 274–289. Springer, Berlin (2014)

  16. Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Proceeding NIPS’13 Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 2292–2300 (2013)

  17. Ferradans, S., Papadakis, N., Rabin, J., Peyré, G., Aujol, J.-F.: Scale Space and Variational Methods in Computer Vision: 4th International Conference, SSVM 2013, Schloss Seggau, Leibnitz, Austria, June 2–6, 2013. In: Proceedings, chapter Regularized Discrete Optimal Transport, pp. 428–439. Springer, Berlin (2013)

  18. Frisch, U., Matarrese, S., Mohayaee, R., Sobolevski, A.: A reconstruction of the initial conditions of the universe by optimal mass transportation. Nature 417(6886), 260–262 (2002)

  19. Frisch, U., Sobolevskii, A.: Application of optimal transportation theory to the reconstruction of the early universe. J. Math. Sci. (New York) 133(1), 303–309 (2004)

  20. Frogner, C., Zhang, C., Mobahi, H., Araya-Polo, M., Poggio, T.A.: Learning with a Wasserstein loss. In: Proceeding NIPS’15 Proceedings of the 28th International Conference on Neural Information Processing Systems, pp. 2053–2061 (2015)

  21. García Trillos, N., Slepčev, D.: Continuum limit of Total Variation on point clouds. Arch. Rational Mech. Anal. 220(1), 193–241 (2016)

  22. Górecki, T., Łuczak, M.: Multivariate time series classification with parametric derivative dynamic time warping. Expert Syst. Appl. 42(5), 2305–2312 (2015)

  23. Grenander, U., Miller, M.I.: Computational anatomy: an emerging discipline. Q. Appl. Math. LVI(4), 617–694 (1998)

  24. Haber, E., Rehman, T., Tannenbaum, A.: An efficient numerical method for the solution of the \(L_2\) optimal mass transfer problem. SIAM J. Sci. Comput. 32(1), 197–211 (2010)

  25. Haker, S., Tannenbaum, A.: On the Monge–Kantorovich problem and image warping. IMA Vol. Math. Appl. 133, 65–86 (2003)

  26. Haker, S., Tannenbaum, A., Kikinis, R.: Medical Image Computing and Computer-Assisted Intervention—MICCAI 2001: 4th International Conference Utrecht, The Netherlands, October 14–17, 2001 Proceedings, chapter Mass Preserving Mappings and Image Registration, pp. 120–127. Springer, Berlin (2001)

  27. Haker, S., Zhu, L., Tannenbaum, A., Angenent, S.: Optimal mass transport for registration and warping. Int. J. Comput. Vis. 60(3), 225–240 (2004)

  28. Hug, R., Maitre, E., Papadakis, N.: Multi-physics optimal transportation and image interpolation. ESAIM Math. Model. Numer. Anal. 49(6), 1671–1692 (2015)

  29. Joshi, S.C., Miller, M.I.: Landmark matching via large deformation diffeomorphisms. IEEE Trans. Image Process. 9(8), 1357–1370 (2000)

  30. Kadous, M.W.: Temporal classification: extending the classification paradigm to multivariate time series. PhD thesis, The University of New South Wales (2002)

  31. Kantorovich, L.V.: On the translocation of masses. Dokl. Akad. Nauk SSSR 37, 199–201 (1942)

  32. Kantorovich, L.V.: A problem of Monge. Uspekhi Mat. Nauk. 3(24), 225–226 (1948)

  33. Khan, A.M., Rajpoot, N., Treanor, D., Magee, D.: A nonlinear mapping approach to stain normalization in digital histopathology images using image-specific color deconvolution. IEEE Trans. Biomed. Eng. 61(6), 1729–1738 (2014)

  34. Kolouri, S., Park, S., Rohde, G.K.: The Radon cumulative distribution transform and its application to image classification. IEEE Trans. Image Process. 25(2), 920–934 (2016)

  35. Kolouri, S., Park, S., Thorpe, M., Slepčev, D., Rohde, G.K.: Transport-based analysis, modeling, and learning from signal and data distributions. arXiv:1609.04767 (2016)

  36. Kolouri, S., Rohde, G.K.: Transport-based single frame super resolution of very low resolution face images. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4876–4884 (2015)

  37. Kolouri, S., Tosun, A.B., Ozolek, J.A., Rohde, G.K.: A continuous linear optimal transport approach for pattern analysis in image datasets. Pattern Recognit. 51, 453–462 (2016)

  38. Kondratyev, S., Monsaingeon, L., Vorotnikov, D.: A new optimal transport distance on the space of finite Radon measures. Adv. Differ. Equ. 21(11–12), 1117–1164 (2016)

  39. Kruskal, J.B.: Nonmetric multidimensional scaling: a numerical method. Psychometrika 29(2), 115–129 (1964)

  40. Lellmann, J., Lorenz, D.A., Schönlieb, C., Valkonen, T.: Imaging with Kantorovich–Rubinstein discrepancy. SIAM J. Imaging Sci. 7(4), 2833–2859 (2014)

  41. Lichman, M.: UCI machine learning repository (2013)

  42. Liero, M., Mielke, A., Savaré, G.: Optimal transport in competition with reaction: the Hellinger–Kantorovich distance and geodesic curves. arXiv:1509.00068 (2015)

  43. Liero, M., Mielke, A., Savaré, G.: Optimal entropy-transport problems and a new Hellinger–Kantorovich distance between positive measures. arXiv:1508.07941 (2016)

  44. Lipman, Y., Daubechies, I.: Conformal Wasserstein distances: comparing surfaces in polynomial time. Adv. Math. 227(3), 1047–1077 (2011)

  45. Magee, D., Treanor, D., Crellin, D., Shires, M., Smith, K., Mohee, K., Quirke, P.: Colour normalisation in digital histopathology images. In: Proceedings of Optical Tissue Image analysis in Microscopy, Histopathology and Endoscopy (MICCAI Workshop), vol. 100 (2009)

  46. Mérigot, Q.: A multiscale approach to optimal transport. Comput. Graph. Forum 30(5), 1583–1592 (2011)

  47. Monge, G.: Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences, 666–704 (1781)

  48. Montavon, G., Müller, K.-R., Cuturi, M.: Wasserstein training of Boltzmann machines. arXiv:1507.01972 (2015)

  49. Morovic, J., Sun, P.-L.: Accurate 3D image colour histogram transformation. Pattern Recognit. Lett. 24(11), 1725–1735 (2003)

  50. Museyko, O., Stiglmayr, M., Klamroth, K., Leugering, G.: On the application of the Monge–Kantorovich problem to image registration. SIAM J. Imaging Sci. 2(4), 1068–1097 (2009)

  51. Nikolova, M., Wen, Y.-W., Chan, R.: Exact histogram specification for digital images using a variational approach. J. Math. Imaging Vis. 46(3), 309–325 (2013)

  52. Ning, L., Georgiou, T.T., Tannenbaum, A.: Matrix-valued Monge–Kantorovich optimal mass transport. In: 52nd IEEE Conference on Decision and Control, pp. 3906–3911 (2013)

  53. Oberman, A.M., Ruan, Y.: An efficient linear programming method for optimal transportation. arXiv:1509.03668 (2015)

  54. Oudre, L., Jakubowicz, J., Bianchi, P., Simon, C.: Classification of periodic activities using the Wasserstein distance. IEEE Trans. Biomed. Eng. 59(6), 1610–1619 (2012)

  55. Ozolek, J.A., Tosun, A.B., Wang, W., Chen, C., Kolouri, S., Basu, S., Huang, H., Rohde, G.K.: Accurate diagnosis of thyroid follicular lesions from nuclear morphology using supervised learning. Med. Image Anal. 18(5), 772–780 (2014)

  56. Papadakis, N., Bugeau, A., Caselles, V.: Image editing with spatiograms transfer. IEEE Trans. Image Process. 21(5), 2513–2522 (2012)

  57. Paris, S., Durand, F.: A fast approximation of the bilateral filter using a signal processing approach. Int. J. Comput. Vis. 81(1), 24–52 (2009)

  58. Park, S., Kolouri, S., Kundu, S., Rohde, G.: The cumulative distribution transform and linear pattern classification. arXiv:1507.05936 (2015)

  59. Pele, O., Werman, M.: A linear time histogram metric for improved SIFT matching. In: European Conference on Computer Vision, pp. 495–508 (2008)

  60. Pele, O., Werman, M.: Fast and robust Earth Mover’s Distances. In: IEEE 12th International Conference on Computer Vision, pp. 460–467 (2009)

  61. Pitié, F., Kokaram, A.: The linear Monge–Kantorovitch linear colour mapping for example-based colour transfer. In: 4th European Conference on Visual Media Production, pp. 1–9 (2007)

  62. Rabin, J., Ferradans, S., Papadakis, N.: Adaptive color transfer with relaxed optimal transport. In: IEEE International Conference on Image Processing (ICIP), pp. 4852–4856 (2014)

  63. Rabin, J., Papadakis, N.: Geometric Science of Information: Second International Conference, GSI 2015, Palaiseau, France, October 28–30, 2015, Proceedings, chapter Non-convex Relaxation of Optimal Transport for Color Transfer Between Images, pp. 87–95. Springer International Publishing (2015)

  64. Rabin, J., Peyré, G.: Wasserstein regularization of imaging problem. In: 18th IEEE International Conference on Image Processing (ICIP), pp. 1521–1544 (2011)

  65. Rabin, J., Peyré, G., Cohen, L.D.: Computer Vision—ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part V, chapter Geodesic Shape Retrieval via Optimal Mass Transport, pp. 771–784. Springer, Berlin (2010)

  66. Reinhard, E., Adhikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Comput. Graph. Appl. 21(5), 34–41 (2001)

  67. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

  68. Rubner, Y., Tomasi, C., Guibas, L.J.: The Earth Mover’s Distance as a metric for image retrieval. Int. J. Comput. Vis. 40(2), 99–121 (2000)

  69. Russell, E.J.: Letters to the editor-extension of Dantzig’s algorithm to finding an initial near-optimal basis for the transportation problem. Oper. Res. 17(1), 187–191 (1969)

  70. Samaria, F.S., Harter, A.C.: Parameterisation of a stochastic model for human face identification. In: Proceedings of 1994 IEEE Workshop on Applications of Computer Vision, pp. 138–142 (1994)

  71. Schmitzer, B.: Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31–June 4, 2015, Proceedings, chapter A sparse algorithm for dense optimal transport, pp. 629–641. Springer International Publishing (2015)

  72. Schmitzer, B.: Stabilized sparse scaling algorithms for entropy regularized transport problems. arXiv:1610.06519 (2016)

  73. Shinohara, R.T., Sweeney, E.M., Goldsmith, J., Shiee, N., Mateen, F.J., Calabresi, P.A., Jarso, S., Pham, D.L., Reich, D.S., Crainiceanu, C.M.: Statistical normalization techniques for magnetic resonance imaging. NeuroImage Clin. 6, 9–19 (2014)

  74. Solomon, J., de Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A., Du, T., Guibas, L.: Convolutional Wasserstein distances: efficient optimal transportation on geometric domains. ACM Trans. Graph. 34(4), 66:1–66:11 (2015)

  75. Solomon, J., Rustamov, R., Guibas, L., Butscher, A.: Earth mover’s distances on discrete surfaces. ACM Trans. Graph. (TOG) 33(4), 67 (2014)

  76. Solomon, J., Rustamov, R., Leonidas, G., Butscher, A.: Wasserstein propagation for semi-supervised learning. In: Jebara, T., Xing, E.P. (eds.) Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp. 306–314. JMLR Workshop and Conference Proceedings (2014)

  77. Su, Z., Zeng, W., Wang, Y., Lu, Z.-L., Gu, X.: Shape classification using Wasserstein distance for brain morphometry analysis. Inf. Process. Med. Imaging 24, 411–423 (2015)


  78. Tannenbaum, E., Georgiou, T., Tannenbaum, A.: Signals and control aspects of optimal mass transport and the Boltzmann entropy. In: 49th IEEE Conference on Decision and Control (CDC), pp. 1885–1890 (2010)

  79. Thorpe, M., Slepčev, D.: Transportation \(L^p\) distances: properties and extensions. In Preparation (2016)

  80. Tosun, A.B., Yergiyev, O., Kolouri, S., Silverman, J.F., Rohde, G.K.: Novel computer-aided diagnosis of mesothelioma using nuclear structure of mesothelial cells in effusion cytology specimens. In: Proceedings of SPIE, vol. 9041, pp. 90410Z–90410Z–6 (2014)

  81. Tosun, A.B., Yergiyev, O., Kolouri, S., Silverman, J.F., Rohde, G.K.: Detection of malignant mesothelioma using nuclear structure of mesothelial cells in effusion cytology specimens. Cytom. Part A 87(4), 326–333 (2015)

  82. ur Rehman, T., Haber, E., Pryor, G., Melonakos, J., Tannenbaum, A.: 3D nonrigid registration via optimal mass transport on the GPU. Med. Image Anal. 13(6), 931–940 (2009)

  83. Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics. American Mathematical Society, Providence (2003)

  84. Villani, C.: Optimal Transport: Old and New. Springer, Berlin (2008)

  85. Wang, W., Slepčev, D., Basu, S., Ozolek, J.A., Rohde, G.K.: A linear optimal transportation framework for quantifying and visualizing variations in sets of images. Int. J. Comput. Vis. 101(2), 254–269 (2012)

  86. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2003)


  87. Weinberger, K.Q., Chapelle, O.: Large margin taxonomy embedding for document categorization. In: Advances in Neural Information Processing Systems, pp. 1737–1744 (2009)

  88. Zhu, L., Haker, S., Tannenbaum, A.: Flattening maps for the visualization of multibranched vessels. IEEE Trans. Med. Imaging 24(2), 191–198 (2005)

  89. Zhu, L., Yang, Y., Haker, S., Tannenbaum, A.: An image morphing technique based on optimal mass preserving mapping. IEEE Trans. Image Process. 16(6), 1481–1495 (2007)


Acknowledgements

The authors gratefully acknowledge funding from the NSF (CCF 1421502) and the NIH (GM090033, CA188938) in contributing to a portion of this work. DS also acknowledges funding by the NSF (DMS-1516677). The authors are grateful to the Center for Nonlinear Analysis at CMU for its support. In addition, the authors would like to thank the referees for their valuable comments that led to significant improvements in the paper.

Author information


Corresponding author

Correspondence to Matthew Thorpe.

Appendices

Appendix 1: Performance of \(TL^p_\lambda \) in Classification Problems with Simple and Oscillatory Signals

We compare the performance of the \(TL_\lambda ^2\), \(L^2\) and OT distances with respect to classification/clustering for the three classes \(\{{\mathcal {C}}_i\}_{i=1,2,3}\) of signals defined in Fig. 4. We test how each distance performs by finding the smallest number of data points such that the classes \({\mathcal {C}}_i^N=\{f_i\}_{i=1}^N\subset {\mathcal {C}}_i\) are separable. For sufficiently large N the approximation \(d_{H,\rho }({\mathcal {C}}_i^N,{\mathcal {C}}_j^N) \approx d_{H,\rho }({\mathcal {C}}_i,{\mathcal {C}}_j)\) is used to simplify the computation. Similarly, as a proxy for \({\mathbb {E}}R_\rho ({\mathcal {C}}_i^N)\) we use \(R_\rho (\hat{{\mathcal {C}}}_i^N)\), where

$$\begin{aligned} \hat{{\mathcal {C}}}_i^N = \Bigg \{ f_\ell \, : \,&\ell = \ell _{\min }^i + \frac{n-1}{N-1} \left( \ell _{\max }^i-\ell _{\min }^i \right) , \\&\quad n\in \{1,2,\dots , N\} \Bigg \} \end{aligned}$$

is the uniform sample from class \({\mathcal {C}}_i\) (recall that class \({\mathcal {C}}_i\) is parameterised by \(\ell \in [\ell _{\min }^i,\ell _{\max }^i]\) and, with an abuse of notation, we use the subscript of \(f_\ell \) to denote the dependence on \(\ell \)).

It follows that the class separation distances and class coverage radii are approximated by

$$\begin{aligned} d_{H,L^2}^2({\mathcal {C}}_1^N,{\mathcal {C}}_2^N)&\approx \frac{\alpha }{2}&R^2_{L^2}({\mathcal {C}}_1^N)&\approx \frac{2}{N} \\ d_{H,L^2}^2({\mathcal {C}}_1^N,{\mathcal {C}}_3^N)&\approx \frac{3\alpha }{4}&R^2_{L^2}({\mathcal {C}}_2^N)&\approx \frac{1}{N} \\ d_{H,L^2}^2({\mathcal {C}}_2^N,{\mathcal {C}}_3^N)&\approx \frac{\alpha }{4}&R^2_{L^2}({\mathcal {C}}_3^N)&\approx \frac{2\alpha }{N\gamma } \\ d_{H,\mathrm {OT}}^2({\mathcal {C}}_1^N,{\mathcal {C}}_2^N)&\approx \frac{\beta ^2\alpha }{4}&R^2_{\mathrm {OT}}({\mathcal {C}}_1^N)&\approx \frac{\alpha }{N^2} \\ d_{H,\mathrm {OT}}^2({\mathcal {C}}_1^N,{\mathcal {C}}_3^N)&\approx \frac{\beta ^2\alpha }{4}&R^2_{\mathrm {OT}}({\mathcal {C}}_2^N)&\approx \frac{\alpha }{N^2} \\ d_{H,\mathrm {OT}}^2({\mathcal {C}}_2^N,{\mathcal {C}}_3^N)&\approx \frac{\alpha \gamma ^2}{8}&R^2_{\mathrm {OT}}({\mathcal {C}}_3^N)&\approx \frac{\alpha }{N^2} \\ d_{H,TL^2_\lambda }^2({\mathcal {C}}_1^N,{\mathcal {C}}_2^N)&\approx \frac{\alpha }{2}&R^2_{TL^2_\lambda }({\mathcal {C}}_1^N)&\approx \frac{\alpha ^2}{N} \end{aligned}$$
$$\begin{aligned} d_{H,TL^2_\lambda }^2({\mathcal {C}}_1^N,{\mathcal {C}}_3^N)&\approx \frac{3\alpha }{4}&R^2_{TL^2_\lambda }({\mathcal {C}}_2^N)&\approx \frac{4\alpha ^2}{N} \\ d_{H,TL^2_\lambda }^2({\mathcal {C}}_2^N,{\mathcal {C}}_3^N)&\approx \frac{\alpha }{4}&R^2_{TL^2_\lambda }({\mathcal {C}}_3^N)&\approx \frac{\alpha ^2}{N}. \end{aligned}$$

We have

$$\begin{aligned} \kappa _{12}^2(L^2;N)&\approx \frac{\alpha N}{4},&\kappa ^2_{13}(L^2;N)&\approx \frac{3\gamma N}{8}, \\ \kappa ^2_{12}(\mathrm {OT};N)&\approx \frac{\beta ^2 N^2}{4},&\kappa ^2_{13}(\mathrm {OT};N)&\approx \frac{\beta ^2 N^2}{4}, \\ \kappa ^2_{12}(TL_\lambda ^2;N)&\approx \frac{N}{8\alpha },&\kappa ^2_{13}(TL_\lambda ^2;N)&\approx \frac{3N}{4\alpha }, \\ \kappa ^2_{23}(L^2;N)&\approx \frac{\gamma N}{8}, \\ \kappa ^2_{23}(\mathrm {OT};N)&\approx \frac{\gamma ^2 N^2}{8}, \end{aligned}$$
$$\begin{aligned} \kappa ^2_{23}(TL_\lambda ^2;N)&\approx \frac{N}{16 \alpha }. \end{aligned}$$

Finally, setting \(\kappa _{ij} = 1\), we can compute \(N^*\):

$$\begin{aligned} N^*_{12}(L^2)&\approx \frac{4}{\alpha },&N^*_{13}(L^2)&\approx \frac{8}{3\gamma },&N^*_{23}(L^2)&\approx \frac{8}{\gamma } \\ N^*_{12}(\mathrm {OT})&\approx \frac{2}{\beta },&N^*_{13}(\mathrm {OT})&\approx \frac{2}{\beta },&N^*_{23}(\mathrm {OT})&\approx \frac{\sqrt{8}}{\gamma } \\ N^*_{12}(TL^2)&\approx 8\alpha ,&N^*_{13}(TL^2)&\approx \frac{4\alpha }{3},&N^*_{23}(TL^2)&\approx 16 \alpha \end{aligned}$$

which for \(\beta >\frac{\alpha }{2}\), \(\beta >\frac{3\gamma }{4}\) and \(\gamma <\frac{\sqrt{2}\alpha }{8}\) implies the ordering given in Sect. 4.1.
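The tabulated values can be checked mechanically. The following script is our own sanity check, not from the paper: it assumes the convention \(\kappa ^2_{ij} = d^2_{H}({\mathcal {C}}_i^N,{\mathcal {C}}_j^N)/\max \{R^2({\mathcal {C}}_i^N),R^2({\mathcal {C}}_j^N)\}\), with \(N^*\) defined by \(\kappa _{ij}(N^*)=1\); under this reading every \(\kappa ^2\) and \(N^*\) entry above follows from the separation/coverage table. Illustrated here for the \(L^2\) rows with sample values \(\alpha =2\), \(\gamma =1/2\):

```python
# Sanity check (our inferred convention, not the paper's code):
# kappa_ij^2 = d_H^2 / max(R_i^2, R_j^2), and N* solves kappa_ij(N*) = 1.
alpha, gamma, N = 2.0, 0.5, 50.0

# L^2 separation distances and coverage radii from the table above
d2_12, d2_13, d2_23 = alpha / 2, 3 * alpha / 4, alpha / 4
R2_1, R2_2, R2_3 = 2 / N, 1 / N, 2 * alpha / (N * gamma)

kappa2_12 = d2_12 / max(R2_1, R2_2)   # tabulated: alpha * N / 4
kappa2_13 = d2_13 / max(R2_1, R2_3)   # tabulated: 3 * gamma * N / 8
kappa2_23 = d2_23 / max(R2_2, R2_3)   # tabulated: gamma * N / 8

# N* from kappa_12(N*) = 1 gives N*_12(L^2) = 4 / alpha
N_star_12 = 4 / alpha
```

(The \(\kappa ^2_{13}\) entry uses \(R^2_3 \ge R^2_1\), i.e. \(\alpha \ge \gamma \), which holds for the sample values.)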

Appendix 2: Numerical Methods

In principle, any numerical method for computing OT distances that can handle an arbitrary cost function can be adapted to compute the \(TL^p_\lambda \) distance. Here we describe the two numerical methods we used in Sect. 4.

1.1 Iterative Linear Programming

Here we describe the iterative linear programming method of Oberman and Ruan [53], which we abbreviate OR. Although this method is not guaranteed to find the minimum in (3), we find that it works well in practice and is easier to implement than, for example, the methods due to Schmitzer [71] that provably minimise (3) but require a more advanced refinement procedure. See also [46] and references therein for a multiscale descent approach.

The linear programming problem restricted to a subset \({\mathcal {M}}\subseteq \Omega _h^2\) is

$$\begin{aligned} \min _{\pi \in \Pi (\mu _h,\nu _h)} \left\{ \sum _{(x_i,x_j)\in {\mathcal {M}}} c_\lambda (x_i,x_j;f_h,g_h)\, \pi _{ij} \, : \, \pi _{ij} = 0 \text { for } (x_i,x_j)\notin {\mathcal {M}} \right\} \qquad (\hbox {LP}_{h}) \end{aligned}$$

where \(c_\lambda \) is given by (4). When \({\mathcal {M}}=\Omega _h^2\), the \(TL_\lambda ^p\) distance between \((f_h,\mu _h)\) and \((g_h,\nu _h)\) is the minimum of the above linear programme. Furthermore, if \(\pi _h\) is the minimiser in the \(TL_\lambda ^p\) distance then it is also the solution to the linear programme (\(\hbox {LP}_{h}\)) for any \({\mathcal {M}}\) containing the support of \(\pi _h\). That is, if one already knows (or can reasonably estimate) the set of nodes \({\mathcal {M}}\) on which the optimal plan is nonzero, then one need only consider the linear programme on \({\mathcal {M}}\). This is advantageous when \({\mathcal {M}}\) is a much smaller set. Motivated by Proposition 3.5, we expect to be able to write the optimal plan as a map. This implies that, whilst \(\pi _h\) has \(n^2\) unknowns, we expect only \(n\) of them to be nonzero.
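For small \(n\) the unrestricted problem (\({\mathcal {M}}=\Omega _h^2\)) can be solved directly with an off-the-shelf LP solver. The following sketch is our own illustration: the function name, the uniform weights \(\mu _h=\nu _h=1/n\), and the placement of \(\lambda \) in the cost are assumptions rather than the paper's exact convention in (4).

```python
import numpy as np
from scipy.optimize import linprog

def tlp_distance(x, f, g, p=2.0, lam=1.0):
    """Solve the full linear programme (M = Omega_h^2) for a TL^p-type
    distance between signals f, g sampled on a common uniform grid x, with
    uniform weights mu = nu = 1/n.  The cost is assumed to be
    c(x_i, x_j) = |x_i - x_j|**p + lam * |f(x_i) - g(x_j)|**p."""
    n = len(x)
    C = np.abs(x[:, None] - x[None, :]) ** p \
        + lam * np.abs(f[:, None] - g[None, :]) ** p
    # coupling constraints: rows of pi sum to mu_i, columns to nu_j
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # sum_j pi_ij = 1/n
        A_eq[n + i, i::n] = 1.0            # sum_i pi_ij = 1/n
    b_eq = np.full(2 * n, 1.0 / n)
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return max(res.fun, 0.0) ** (1.0 / p)
```

With \(f=g\) the identity coupling has zero cost, so the distance vanishes. The restricted problem simply drops the variables corresponding to pairs outside \({\mathcal {M}}\).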

Algorithm 1  Multiscale iterative linear programming (OR)

The method proposed by OR is given in Algorithm 1. An initial discretisation scale \(h_0\) is given, and an estimate \(\pi _{h_0}\) is found for the linear programme (\(\hbox {LP}_{h}\)) with \({\mathcal {M}} = \Omega _{h_0}^2\). One then iteratively finds \({\mathcal {M}}_r\subseteq \Omega _{h_r}^2\), where \(h_r=\frac{h_{r-1}}{2}\), as the set of nodes defined by the following refinement procedure: find the set of nodes on which \(\pi _{h_{r-1}}\) is nonzero, add the neighbouring nodes, and then project onto the refined grid \(\Omega _{h_r}^2\). The optimal plan \(\pi _{h_r}\) on \(\Omega _{h_r}^2\) is then estimated by solving the linear programme (\(\hbox {LP}_{h}\)) with \({\mathcal {M}}={\mathcal {M}}_r\).
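The refinement step can be sketched as follows; this is a minimal illustration in one spatial dimension with a factor-of-2 refinement, and the function name and 8-neighbourhood convention are ours, not from [53].

```python
def refine_support(support, n_coarse):
    """Given the nonzero entries of a coarse transport plan (pairs (i, j) on
    an n_coarse x n_coarse index grid), return the candidate index set M_r on
    the grid refined by a factor of 2: the nonzero nodes plus their
    neighbours, projected onto the children of each coarse node."""
    # grow: add the 8-neighbourhood of every nonzero node on the coarse grid
    grown = set()
    for (i, j) in support:
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                a, b = i + di, j + dj
                if 0 <= a < n_coarse and 0 <= b < n_coarse:
                    grown.add((a, b))
    # project: each coarse node has two children per axis on the fine grid
    fine = set()
    for (i, j) in grown:
        for ci in (2 * i, 2 * i + 1):
            for cj in (2 * j, 2 * j + 1):
                fine.add((ci, cj))
    return fine
```

When the optimal plan is (close to) a map, the support has \(O(n)\) entries, so \({\mathcal {M}}_r\) stays far smaller than the full \(n^2\) pair grid.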

The grid \(\Omega _{h_r}\) has on the order of \(2^{rd}h_0^{-d}\) nodes, so \(\Omega _{h_r}^2\) scales as \((2^{rd}h_0^{-d})^2\). If the linear programme is run \(N\) times then at the \(r\)th step the linear programme has on the order of \(2^{rd}h_0^{-d}\) variables. In particular, on the last (and most expensive) step the number of variables is \(O(2^{Nd}h_0^{-d})\). This compares with \((2^{Nd}h_0^{-d})^2\) variables if the linear programme were run on the final grid without this refinement procedure.

1.2 Entropic Regularisation

Cuturi, in the context of computing OT distances, proposed regularising the minimisation in (3) with entropy [16]. This was further developed by Benamou, Carlier, Cuturi, Nenna and Peyré [5], abbreviated to BCCNP, and it is their method we describe here. Instead of considering the distance \(TL^p_\lambda \) we consider

$$\begin{aligned} S_\epsilon = \inf _{\pi \in \Pi (\mu ,\nu )} \left\{ \sum _{i=1}^n \sum _{j=1}^n c_\lambda (x_i,x_j;f,g) \pi _{ij} - \epsilon H(\pi ) \right\} \end{aligned}$$

where \(H(\pi )= - \sum _{i=1}^n \sum _{j=1}^n \pi _{ij} \log \pi _{ij}\) is the entropy. In the OT case the distance \(S_\epsilon \) is also known as the Sinkhorn distance. A short calculation shows

$$\begin{aligned} S_\epsilon = \epsilon \inf _{\pi \in \Pi (\mu ,\nu )} \left\{ \mathrm {KL}(\pi |{\mathcal {K}}) \right\} \end{aligned}$$

where \({\mathcal {K}}_{ij} = \exp \left( -\frac{c_\lambda (x_i,x_j;f,g)}{\epsilon }\right) \) (the exponential is taken pointwise) and \(\mathrm {KL}\) is the Kullback–Leibler divergence defined by

$$\begin{aligned} \mathrm {KL}(\pi |{\mathcal {K}}) = \sum _{i=1}^n \sum _{j=1}^n \pi _{ij} \log \left( \frac{\pi _{ij}}{{\mathcal {K}}_{ij}} \right) . \end{aligned}$$
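The short calculation is the following: since \(H(\pi ) = -\sum _{i,j}\pi _{ij}\log \pi _{ij}\) and \(\log {\mathcal {K}}_{ij} = -c_\lambda (x_i,x_j;f,g)/\epsilon \),

$$\begin{aligned} \sum _{i,j=1}^n c_\lambda (x_i,x_j;f,g)\, \pi _{ij} - \epsilon H(\pi ) = \epsilon \sum _{i,j=1}^n \pi _{ij} \left( \log \pi _{ij} + \frac{c_\lambda (x_i,x_j;f,g)}{\epsilon } \right) = \epsilon \sum _{i,j=1}^n \pi _{ij} \log \left( \frac{\pi _{ij}}{{\mathcal {K}}_{ij}} \right) = \epsilon \, \mathrm {KL}(\pi |{\mathcal {K}}), \end{aligned}$$

and taking the infimum over \(\pi \in \Pi (\mu ,\nu )\) gives the identity above.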

It can be shown that the optimal choice of \(\pi \) for \(S_\epsilon \) can be written in the form \(\pi ^* = \mathrm {diag}(u) {\mathcal {K}} \mathrm {diag}(v)\) where \(u,v\in {\mathbb {R}}^n\) are limits, as \(r \rightarrow \infty \), of the sequence

$$\begin{aligned} v^{(0)} = {\mathbb {I}}, \quad u^{(r)} = \frac{\underline{p}}{{\mathcal {K}} v^{(r)}}, \quad v^{(r+1)} = \frac{\underline{q}}{{\mathcal {K}}^\top u^{(r)}} \end{aligned}$$

and \({\underline{p}} = (p_1,\dots ,p_n)\), \({\underline{q}}=(q_1,\dots ,q_n)\) (multiplication is the usual matrix–vector multiplication, division is pointwise and \(\top \) denotes the matrix transpose). The algorithm, given in Algorithm 2, is a special case of iterative Bregman projections and is also known as the Sinkhorn algorithm.
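The iteration above is straightforward to implement. A minimal sketch of our own follows, with a fixed iteration count standing in for the stopping rule discussed below:

```python
import numpy as np

def sinkhorn(p, q, C, eps, n_iter=500):
    """Entropy-regularised plan by Sinkhorn iterations: returns
    pi = diag(u) K diag(v) with K_ij = exp(-C_ij / eps), where C holds the
    costs c_lambda(x_i, x_j; f, g) and p, q are the marginals."""
    K = np.exp(-C / eps)          # Gibbs kernel, exponential taken pointwise
    v = np.ones_like(q)           # v^(0) = 1
    for _ in range(n_iter):
        u = p / (K @ v)           # u^(r)   = p / (K v^(r))
        v = q / (K.T @ u)         # v^(r+1) = q / (K^T u^(r))
    return u[:, None] * K * v[None, :]
```

After each \(v\) update the column marginals of the plan equal \(q\) exactly, while the row marginals converge to \(p\) as the iterations proceed.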

The stopping condition proposed in [16] is to let \(\pi ^{(r)} = \mathrm {diag}(u^{(r)}) {\mathcal {K}} \mathrm {diag}(v^{(r)})\) and stop when the relative change in the regularised objective is small:

$$\begin{aligned} \left| \frac{\sum _{i,j=1}^n c_\lambda (x_i,x_j;f,g) \pi ^{(r)}_{ij} - \epsilon H(\pi ^{(r)})}{\sum _{i,j=1}^n c_\lambda (x_i,x_j;f,g) \pi ^{(r-1)}_{ij} - \epsilon H(\pi ^{(r-1)})} - 1 \right| < 10^{-4}. \end{aligned}$$

Note that although as \(\epsilon \rightarrow 0\) we recover the unregularised \(TL_\lambda ^p\) distance, the entries of \({\mathcal {K}}\) decay to zero exponentially in \(1/\epsilon \), which leads to numerical instability for small \(\epsilon \). These instabilities have been addressed in, for example, [14, 72].

For OT with quadratic cost \(c(x,y) = |x-y|^2\) the Sinkhorn algorithm can be implemented more efficiently using Gaussian convolutions [74]. The two numerical methods described so far use the formulation of \(TL^2_\lambda \) given by (3) and (4), which interprets \(TL^2_\lambda \) as an OT distance between the measures \(\mu \) and \(\nu \) for a (non-quadratic) cost function \(c_\lambda (\cdot ,\cdot ;f,g)\); hence one cannot make direct use of OT-specific methods such as [74].

However, we also recall that we can define the \(TL^2_\lambda \) distance as the OT distance between the measures \((f\times \mathrm {Id})_{\#} \mu \) and \((g\times \mathrm {Id})_{\#} \nu \), see (5), in which case the entropy-regularised approach can be implemented using Gaussian convolutions in dimension \(d+m\) (when \(p=2\)), where \(f:\Omega \subseteq {\mathbb {R}}^d\rightarrow {\mathbb {R}}^m\). Although this means that the numerical method operates in a higher dimension, we note the success of the bilateral grid method for bilateral filters, which is also based on computing a Gaussian filter in a higher dimension [10, 57]. For colour images, where \(m=3\), this approach may not be efficient; however, for \(m=1\) these ideas have the potential to yield an improved algorithm.
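To illustrate the idea, here is a sketch of our own, for a uniform one-dimensional grid of spacing \(h\) and quadratic cost, of applying the Gibbs kernel \({\mathcal {K}}_{ij}=\exp (-|x_i-x_j|^2/\epsilon )\) by Gaussian filtering instead of forming the \(n\times n\) matrix. The scaling iterations are invariant to a constant rescaling of \({\mathcal {K}}\) (a constant factor shifts the cost by a constant, which does not change the optimal plan), so the filter's kernel normalisation is harmless; this illustrates the principle of [74] rather than reproducing their method.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gibbs_apply(v, h, eps):
    """Compute K v, with K_ij = exp(-((i - j) * h)**2 / eps) on a uniform
    1-D grid of spacing h, as a Gaussian convolution instead of an O(n^2)
    matrix-vector product.  Matching exp(-m**2 * h**2 / eps) with
    exp(-m**2 / (2 * sigma**2)) gives sigma (in pixels) = sqrt(eps/2) / h.
    gaussian_filter normalises its truncated kernel to sum to 1, so the
    result equals K v up to a constant factor."""
    sigma_px = np.sqrt(eps / 2.0) / h
    return gaussian_filter(v, sigma=sigma_px, mode="constant", truncate=8.0)
```

In the product-measure formulation (5) the same trick extends to a \((d+m)\)-dimensional Gaussian filter.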

Algorithm 2  The Sinkhorn algorithm (iterative Bregman projections) for \(S_\epsilon \)


Cite this article

Thorpe, M., Park, S., Kolouri, S. et al. A Transportation \(L^p\) Distance for Signal Analysis. J Math Imaging Vis 59, 187–210 (2017). https://doi.org/10.1007/s10851-017-0726-4
